nlp_architect.models.gnmt.scripts package

Submodules

nlp_architect.models.gnmt.scripts.bleu module

Python implementation of BLEU and smooth-BLEU.

This module provides a Python implementation of BLEU and smooth-BLEU. Smooth BLEU is computed following the method outlined in the paper: Chin-Yew Lin, Franz Josef Och. ORANGE: a method for evaluating automatic evaluation metrics for machine translation. COLING 2004.

nlp_architect.models.gnmt.scripts.bleu.compute_bleu(reference_corpus, translation_corpus, max_order=4, smooth=False)[source]

Computes BLEU score of translated segments against one or more references.

Parameters:
  • reference_corpus – list of lists of references for each translation. Each reference should be tokenized into a list of tokens.
  • translation_corpus – list of translations to score. Each translation should be tokenized into a list of tokens.
  • max_order – Maximum n-gram order to use when computing BLEU score.
  • smooth – Whether or not to apply the smoothing of Lin and Och (2004).
Returns:

Tuple with the BLEU score, n-gram precisions, geometric mean of n-gram precisions and brevity penalty.
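The computation described above can be sketched as follows. This is a minimal, self-contained illustration and not the module's source: the names ngram_counts and bleu_sketch are hypothetical, and using the shortest reference length for the brevity penalty and add-one smoothing on every n-gram order are assumptions. The return value mirrors the four items listed in the description.

```python
import collections
import math


def ngram_counts(tokens, max_order):
    """Count every n-gram of order 1..max_order in a list of tokens."""
    counts = collections.Counter()
    for order in range(1, max_order + 1):
        for i in range(len(tokens) - order + 1):
            counts[tuple(tokens[i:i + order])] += 1
    return counts


def bleu_sketch(reference_corpus, translation_corpus, max_order=4, smooth=False):
    """Hypothetical corpus-level BLEU; inputs are shaped like compute_bleu's."""
    matches = [0] * max_order
    possible = [0] * max_order
    ref_len = 0
    trans_len = 0
    for references, translation in zip(reference_corpus, translation_corpus):
        # Assumption: the shortest reference sets the reference length.
        ref_len += min(len(r) for r in references)
        trans_len += len(translation)
        # Clip each n-gram count by its maximum count in any single reference.
        merged_ref = collections.Counter()
        for reference in references:
            merged_ref |= ngram_counts(reference, max_order)
        overlap = ngram_counts(translation, max_order) & merged_ref
        for ngram, count in overlap.items():
            matches[len(ngram) - 1] += count
        for order in range(1, max_order + 1):
            possible[order - 1] += max(len(translation) - order + 1, 0)

    precisions = []
    for matched, total in zip(matches, possible):
        if smooth:
            # Assumption: add-one smoothing applied to every order.
            precisions.append((matched + 1.0) / (total + 1.0))
        else:
            precisions.append(matched / total if total > 0 else 0.0)

    if min(precisions) > 0:
        geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_order)
    else:
        geo_mean = 0.0

    # Brevity penalty: penalize translations shorter than the references.
    if trans_len == 0 or ref_len == 0:
        bp = 0.0
    elif trans_len > ref_len:
        bp = 1.0
    else:
        bp = math.exp(1.0 - ref_len / trans_len)

    return geo_mean * bp, precisions, geo_mean, bp


references = [[["the", "cat", "sat", "on", "the", "mat"]]]
translations = [["the", "cat", "sat", "on", "the", "mat"]]
score, precisions, geo_mean, bp = bleu_sketch(references, translations)
```

Note the nesting of the inputs: reference_corpus holds one list of tokenized references per translation, so a single reference still needs its own enclosing list.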

nlp_architect.models.gnmt.scripts.rouge module

ROUGE metric implementation.

Copied from tf_seq2seq/seq2seq/metrics/rouge.py. This is a modified and slightly extended version of https://github.com/miso-belica/sumy/blob/dev/sumy/evaluation/rouge.py.

nlp_architect.models.gnmt.scripts.rouge.rouge(hypotheses, references)[source]

Calculates average ROUGE scores for a list of hypotheses and references.

nlp_architect.models.gnmt.scripts.rouge.rouge_l_sentence_level(evaluated_sentences, reference_sentences)[source]

Computes ROUGE-L (sentence level) of two text collections of sentences. Source: http://research.microsoft.com/en-us/um/people/cyl/download/papers/rouge-working-note-v1.3.1.pdf

Calculated according to:
  R_lcs = LCS(X,Y)/m
  P_lcs = LCS(X,Y)/n
  F_lcs = ((1 + beta^2)*R_lcs*P_lcs) / (R_lcs + (beta^2) * P_lcs)

where:
  X = reference summary
  Y = candidate summary
  m = length of reference summary
  n = length of candidate summary

Parameters:
  • evaluated_sentences – The sentences that have been picked by the summarizer
  • reference_sentences – The sentences from the reference set
Returns:

F_lcs

Return type:

A float

Raises:

ValueError – if either argument is empty (len <= 0)
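The sentence-level formulas above can be sketched with a standard dynamic-programming LCS. This is an illustration under stated assumptions, not the module's code: lcs_length, f_lcs and rouge_l_sentence_sketch are hypothetical names, sentences are assumed to be whitespace-separated strings, and beta is exposed as a parameter defaulting to 1.0 because this page does not say how the module chooses it.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of two token lists."""
    table = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, x_token in enumerate(x, start=1):
        for j, y_token in enumerate(y, start=1):
            if x_token == y_token:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(x)][len(y)]


def f_lcs(lcs, m, n, beta=1.0):
    """F_lcs from an LCS count, reference length m and candidate length n."""
    r_lcs = lcs / m
    p_lcs = lcs / n
    denominator = r_lcs + (beta ** 2) * p_lcs
    if denominator == 0:
        return 0.0
    return ((1 + beta ** 2) * r_lcs * p_lcs) / denominator


def rouge_l_sentence_sketch(evaluated_sentences, reference_sentences, beta=1.0):
    """Hypothetical sentence-level ROUGE-L over whitespace-tokenized strings."""
    if len(evaluated_sentences) <= 0 or len(reference_sentences) <= 0:
        raise ValueError("Collections must contain at least 1 sentence.")
    # Assumption: each collection is flattened into one token sequence.
    reference = [tok for s in reference_sentences for tok in s.split()]
    candidate = [tok for s in evaluated_sentences for tok in s.split()]
    if not reference or not candidate:
        return 0.0
    lcs = lcs_length(reference, candidate)
    return f_lcs(lcs, len(reference), len(candidate), beta)


score = rouge_l_sentence_sketch(
    ["police killed the gunman"], ["police kill the gunman"]
)
```

Here the LCS is "police the gunman" (3 tokens) and both summaries have 4 tokens, so R_lcs and P_lcs are both 3/4.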

nlp_architect.models.gnmt.scripts.rouge.rouge_l_summary_level(evaluated_sentences, reference_sentences)[source]

Computes ROUGE-L (summary level) of two text collections of sentences. Source: http://research.microsoft.com/en-us/um/people/cyl/download/papers/rouge-working-note-v1.3.1.pdf

Calculated according to:
  R_lcs = SUM(1, u)[LCS<union>(r_i,C)]/m
  P_lcs = SUM(1, u)[LCS<union>(r_i,C)]/n
  F_lcs = ((1 + beta^2)*R_lcs*P_lcs) / (R_lcs + (beta^2) * P_lcs)

where:
  SUM(1, u) = sum over i from 1 through u
  u = number of sentences in reference summary
  r_i = the i-th sentence of the reference summary
  C = candidate summary made up of v sentences
  m = number of words in reference summary
  n = number of words in candidate summary

Parameters:
  • evaluated_sentences – The sentences that have been picked by the summarizer
  • reference_sentences – The sentences in the reference summary
Returns:

F_lcs

Return type:

A float

Raises:

ValueError – if either argument is empty (len <= 0)
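The union LCS in the summary-level formulas can be sketched as below. This is a hedged illustration, not the module's implementation: the function names are hypothetical, sentences are assumed to be whitespace-separated strings, beta defaults to 1.0 as an assumption, and the union is taken over token positions in each reference sentence, which may differ from how the module counts it.

```python
def lcs_reference_positions(reference, candidate):
    """Positions in `reference` covered by one LCS with `candidate`."""
    rows, cols = len(reference), len(candidate)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if reference[i - 1] == candidate[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back through the table to recover the matched reference positions.
    positions = set()
    i, j = rows, cols
    while i > 0 and j > 0:
        if reference[i - 1] == candidate[j - 1]:
            positions.add(i - 1)
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return positions


def rouge_l_summary_sketch(evaluated_sentences, reference_sentences, beta=1.0):
    """Hypothetical summary-level ROUGE-L using the union LCS."""
    if len(evaluated_sentences) <= 0 or len(reference_sentences) <= 0:
        raise ValueError("Collections must contain at least 1 sentence.")
    candidates = [s.split() for s in evaluated_sentences]
    references = [s.split() for s in reference_sentences]
    m = sum(len(r) for r in references)  # words in reference summary
    n = sum(len(c) for c in candidates)  # words in candidate summary
    if m == 0 or n == 0:
        return 0.0
    union_total = 0
    for r_i in references:
        # LCS<union>(r_i, C): union of the LCS matches against every c_j.
        covered = set()
        for c_j in candidates:
            covered |= lcs_reference_positions(r_i, c_j)
        union_total += len(covered)
    r_lcs = union_total / m
    p_lcs = union_total / n
    denominator = r_lcs + (beta ** 2) * p_lcs
    if denominator == 0:
        return 0.0
    return ((1 + beta ** 2) * r_lcs * p_lcs) / denominator


score = rouge_l_summary_sketch(
    ["w1 w2 w6 w7 w8", "w1 w3 w8 w9 w5"], ["w1 w2 w3 w4 w5"]
)
```

In this example the first candidate sentence matches w1 w2 and the second matches w1 w3 w5, so the union covers 4 of the 5 reference words, giving R_lcs = 4/5 and P_lcs = 4/10.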

nlp_architect.models.gnmt.scripts.rouge.rouge_n(evaluated_sentences, reference_sentences, n=2)[source]

Computes ROUGE-N of two text collections of sentences. Source: http://research.microsoft.com/en-us/um/people/cyl/download/papers/rouge-working-note-v1.3.1.pdf

Parameters:
  • evaluated_sentences – The sentences that have been picked by the summarizer
  • reference_sentences – The sentences from the reference set
  • n – Size of ngram. Defaults to 2.
Returns:

A tuple (f1, precision, recall) for ROUGE-N

Raises:

ValueError – if either argument is empty (len <= 0)
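A minimal sketch of an n-gram overlap score with the same return shape follows. It is not the module's source: word_ngrams and rouge_n_sketch are hypothetical names, sentences are assumed to be whitespace-separated strings, and treating n-grams as a set (so repeated n-grams count once) is an assumption made to keep the example short.

```python
def word_ngrams(sentences, n):
    """Set of word n-grams found in a collection of sentence strings."""
    ngrams = set()
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(len(tokens) - n + 1):
            ngrams.add(tuple(tokens[i:i + n]))
    return ngrams


def rouge_n_sketch(evaluated_sentences, reference_sentences, n=2):
    """Hypothetical ROUGE-N returning (f1, precision, recall)."""
    if len(evaluated_sentences) <= 0 or len(reference_sentences) <= 0:
        raise ValueError("Collections must contain at least 1 sentence.")
    evaluated = word_ngrams(evaluated_sentences, n)
    reference = word_ngrams(reference_sentences, n)
    overlap = len(evaluated & reference)
    # Guard against sentences shorter than n, which yield no n-grams.
    precision = overlap / len(evaluated) if evaluated else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0, precision, recall
    f1 = 2.0 * precision * recall / (precision + recall)
    return f1, precision, recall


f1, precision, recall = rouge_n_sketch(
    ["the cat sat on the mat"], ["the cat lay on the mat"]
)
```

Both sentences contain five distinct bigrams and share three of them, so precision and recall are each 3/5.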

Module contents